Genome Characterization and Probiotic Potential of Corynebacterium amycolatum Human Vaginal Isolates

The vaginal microbiome of healthy women contains nondiphtheria corynebacteria. The role and functions of nondiphtheria corynebacteria in the vaginal biotope are still under study. We sequenced and analysed the genomes of three vaginal C. amycolatum strains isolated from healthy women. Previous studies have shown that these strains produced metabolites that significantly increased the antagonistic activity of peroxide-producing lactic acid bacteria against pathogenic and opportunistic microorganisms and had strong antimicrobial activity against opportunistic pathogens. Analysis of the C. amycolatum genomes revealed the genes responsible for adaptation and survival in the vaginal environment, including acid and oxidative stress resistance genes. The genes responsible for the production of H2O2 and the synthesis of secondary metabolites, essential amino acids and vitamins were identified. A cluster of genes encoding the synthesis of bacteriocin was revealed in one of the annotated genomes. The obtained results allow us to consider the studied strains as potential probiotics that are capable of preventing the growth of pathogenic microorganisms and supporting colonisation resistance in the vaginal biotope.


Introduction
The vaginal microbiome is an open, complex, multicomponent system in dynamic equilibrium [1]. It is represented by various microbial communities containing bacteria that can synthesize organic acids, including lactic acid, which maintain a vaginal pH of 3.8-4.4 and thereby support women's health [2,3]. It is generally accepted that lactobacilli, the dominant microorganisms, are mainly responsible for maintaining the stability of the vaginal microbiome. Lactobacilli produce lactate, hydrogen peroxide and various bacteriocins and bacteriocin-like substances, thereby inhibiting the growth of obligate anaerobes and opportunistic microorganisms [2][3][4].
However, using culture-independent methods based on sequencing of the 16S rRNA gene, researchers demonstrated that a significant proportion (7-33%) of healthy women lack vaginal lactobacilli [5,6]. It is known that the absence of lactobacilli is accompanied by the presence of other microorganisms, such as Gardnerella vaginalis or various species of Peptostreptococcus, Prevotella, Pseudomonas, Streptococcus and/or Corynebacterium. Such changes in the structure of the vaginal microbiome are not considered a pathological disorder [7,8].
The Corynebacterium genus contains approximately 130 species of diverse and ecologically significant microorganisms. The well-known typical representatives of this genus are the pathogenic species C. diphtheriae, C. ulcerans, and C. pseudotuberculosis [9], whose roles in the development of human infection have been proven. In addition, a large group of nondiphtheria corynebacteria is part of the resident microflora of human skin and mucous membranes and is most often isolated from clinical samples [10]. Reports on the role of strains of nondiphtheria corynebacteria in protecting human mucous membranes from infection have increased. It was shown that certain types of nondiphtheria corynebacteria produce various bacteriocins, bacteriocin-like substances and biosurfactants, which inhibit the growth of opportunistic microorganisms and their biofilm formation [11,12]. Individual strains have pronounced bactericidal activity against opportunistic microorganisms, including MRSA [13]. Certain strains of nondiphtheria corynebacteria have been recommended for use as probiotic microorganisms [14]. There are a few reports on the use of nondiphtheria corynebacteria as immunomodulators in tumour immunotherapy [15].
Nondiphtheria corynebacteria have been found in the vaginal biotopes of women regardless of age and microecological status [16][17][18]. Along with Staphylococcus epidermidis, nondiphtheria corynebacteria constitute the main part (approximately 80%) of the vaginal microbiota in prepubescent girls [19]. The number of nondiphtheria corynebacteria is also increased in pregnant and postpartum women [20]. Despite the high frequency of occurrence of nondiphtheria corynebacteria in the female genital tract, studies on this topic are limited mainly to the description of pathogens [21,22].
Corynebacterium amycolatum was isolated for the first time by Collins and Burton from clinical specimens in 1988 [23]. Based on our observations, C. amycolatum is rather frequently isolated from vaginal biotopes of healthy women and has a high probiotic potential. In particular, we isolated three strains of corynebacteria from the vaginal contents of healthy women. All of them were identified as C. amycolatum. Metabolites of these strains greatly increased the antagonistic activity of peroxide-producing lactobacilli against pathogenic and opportunistic microorganisms and had strong antimicrobial activity against opportunistic pathogens such as Escherichia coli, Staphylococcus aureus, Klebsiella pneumoniae and Pseudomonas aeruginosa [24,25]. These strains showed the greatest adhesive ability to vaginal epithelial cells and human fibronectin under low pH conditions [25]. Because of these useful properties, we selected the strains for high-throughput sequencing (HTS) and genome annotation. To better understand the ability of corynebacteria to survive in the vaginal biotope under eubiosis and to show their beneficial properties, we sequenced and analysed the genomes of the three isolated strains. In addition, we should note that the genomic characteristics of Corynebacterium amycolatum have not previously been described.

DNA Preparation, Genome Sequencing and Assembly
Strains of C. amycolatum ICIS 5, ICIS 9 and ICIS 53 were previously isolated from vaginal smears of healthy women of reproductive age. The strains are deposited in the Collection of Microorganisms of the Institute for Cellular and Intracellular Symbiosis UrB RAS (Orenburg, Russia) under the same accession names. The phenotypic characteristics of these isolates have been previously described in detail [24,25]. The strains were kept at −80 °C in 20% (v/v) glycerol before the experiment. The isolates were grown in tryptic soy broth (TSB) at 37 °C for 24 h.
Overnight bacterial cultures were used for genomic DNA extraction by the phenol-chloroform method. DNA libraries were prepared and sequenced at the Center of Shared Scientific Equipment "Persistence of microorganisms" at the Institute for Cellular and Intracellular Symbiosis UrB RAS (Orenburg, Russia). The Nextera XT DNA library preparation kit (Illumina, San Diego, CA, USA) was used according to the manufacturer's instructions. High-throughput sequencing of the DNA libraries was carried out on a MiSeq sequencer (Illumina, USA) using the MiSeq reagent kit v3 (2 × 300 cycles) (Illumina, USA). The reads were quality-trimmed with the Trimmomatic tool [26]. De novo genome assembly was carried out with SPAdes (version 3.9.0) [27].
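The trimming and assembly steps described above can be sketched as command lines. The Python snippet below only builds the argument lists and does not run anything. The `trimmomatic` wrapper name, the adapter file, the trimming thresholds, the output file names and the `--careful` flag are assumptions chosen for illustration, because the parameters actually used in this study are not stated.

```python
from pathlib import Path


def trimmomatic_cmd(r1, r2, out_dir, adapters="NexteraPE-PE.fa"):
    """Build a paired-end Trimmomatic command (thresholds are illustrative)."""
    out = Path(out_dir)
    return [
        "trimmomatic", "PE",                 # assumes a wrapper script on PATH
        str(r1), str(r2),
        str(out / "R1.paired.fq.gz"), str(out / "R1.unpaired.fq.gz"),
        str(out / "R2.paired.fq.gz"), str(out / "R2.unpaired.fq.gz"),
        f"ILLUMINACLIP:{adapters}:2:30:10",  # adapter clipping (assumed settings)
        "SLIDINGWINDOW:4:20",                # assumed quality window
        "MINLEN:50",                         # assumed minimum read length
    ]


def spades_cmd(r1_paired, r2_paired, out_dir):
    """Build a SPAdes command for one paired-end library."""
    return [
        "spades.py",
        "-1", str(r1_paired),
        "-2", str(r2_paired),
        "--careful",                         # assumed option, not stated in the text
        "-o", str(out_dir),
    ]


if __name__ == "__main__":
    # Hypothetical file names, used only to show the resulting commands.
    print(" ".join(trimmomatic_cmd("ICIS5_R1.fastq.gz", "ICIS5_R2.fastq.gz", "trim")))
    print(" ".join(spades_cmd("trim/R1.paired.fq.gz", "trim/R2.paired.fq.gz", "asm")))
```

Either list could be passed to `subprocess.run` once the tools are installed; keeping command construction separate from execution makes the chosen parameters easy to record alongside the assembly.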

Genome Annotation
Functional annotation of the genomes was carried out with the RAST server (Rapid Annotation using Subsystem Technology) [28] and the NCBI Prokaryotic Genome Annotation Pipeline (PGAP) [29]. Clusters of orthologous groups (COGs) of proteins were used for functional classification of the predicted proteins [30]. The bioinformatic tools BAGEL4 [31] and antiSMASH 5 [32] were used to determine potential clusters of secondary metabolites with antimicrobial activity. Antibiotic resistance genes in the genomes were predicted using the RGI (Resistance Gene Identifier) tool [33]. The presence of putative virulence genes in the genomes was investigated using the Virulence Factors of Bacterial Pathogens Database (VFDB) [34]. The CRISPR regions were identified with the online detection tool CRISPRfinder [35].

Phylogenetic Analysis
Phylogenetic analysis was conducted based on the 16S rDNA sequences retrieved from draft genomes of the C. amycolatum strains ICIS 5 (WGS Project: SSOR01), ICIS 9 (MTPT01) and ICIS 53 (MIFV01) and from draft genomes of nine C. amycolatum strains currently available in the NCBI database (Supplementary Table S1). Genome similarity of the strains ICIS 5, ICIS 9 and ICIS 53 and other C. amycolatum strains currently available at NCBI was determined by calculating the average nucleotide identity (ANI) and orthologous average nucleotide identity (OrthoANI) using OAT software (version 0.93.1) [39].
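The idea behind an OrthoANI-type value can be illustrated with a short sketch: both genomes are cut into fragments, each fragment is searched against the other genome, and identity is averaged only over fragment pairs that are each other's best hit. The Python function below is a simplified illustration of that averaging step under these assumptions. It is not the OAT implementation, and the fragment identifiers and identity values are invented for the example.

```python
def reciprocal_best_hit_identity(hits_a_to_b, hits_b_to_a):
    """Mean percent identity over reciprocal best-hit fragment pairs.

    Each argument maps a query fragment id to a tuple
    (best subject fragment id, percent identity of that hit).
    """
    identities = []
    for frag_a, (frag_b, ident_ab) in hits_a_to_b.items():
        back = hits_b_to_a.get(frag_b)
        # Keep the pair only if the best hit of frag_b points back to frag_a.
        if back is not None and back[0] == frag_a:
            identities.append((ident_ab + back[1]) / 2)
    if not identities:
        raise ValueError("no reciprocal best hits found")
    return sum(identities) / len(identities)


# Toy best-hit tables (hypothetical values, not data from this study).
a_to_b = {"a1": ("b1", 98.0), "a2": ("b2", 96.0), "a3": ("b2", 90.0)}
b_to_a = {"b1": ("a1", 98.0), "b2": ("a2", 96.0), "b3": ("a3", 91.0)}

print(reciprocal_best_hit_identity(a_to_b, b_to_a))
```

In the toy tables, fragment a3 is discarded because its best hit b2 prefers a2, so only two pairs contribute to the mean.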

Nucleotide Sequence Accession Numbers
The annotated genome sequences were deposited in the GenBank database as sequencing project PRJNA339674 with accession numbers SSOR00000000, MTPT00000000 and MIFV00000000 for C. amycolatum ICIS 5, ICIS 9 and ICIS 53, respectively. The strains C. amycolatum ICIS 9 and ICIS 53 were deposited in the culture collection of the All-Russian Collection of Microorganisms at the G.K. Skryabin Institute of Biochemistry and Physiology of Microorganisms (Russian Academy of Sciences, Pushchino, Russia) under registration no. VKM Ac-2843D and VKM Ac-2844D, respectively.

General Genome Features
As shown in Table 1, the draft genome of strain ICIS 5 was composed of 2,474,151 bp, with an N50 length of 164,886 bp, an L50 of 6, and a G + C content of 58.8%. The final assembled genome consisted of 115 contigs.
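For readers unfamiliar with the assembly statistics quoted here, N50 is the length of the contig at which the cumulative length of contigs, sorted from longest to shortest, first reaches half of the total assembly size, and L50 is the number of contigs needed to get there. A minimal Python sketch follows; the contig lengths in the example are made up and are not taken from the assemblies in this study.

```python
def n50_l50(contig_lengths):
    """Return (N50, L50) for an iterable of contig lengths in bp."""
    lengths = sorted(contig_lengths, reverse=True)
    if not lengths:
        raise ValueError("need at least one contig")
    half_total = sum(lengths) / 2
    running = 0
    for count, length in enumerate(lengths, start=1):
        running += length
        if running >= half_total:
            # 'length' is the N50 contig; 'count' contigs were needed (L50).
            return length, count


# Toy assembly: total 150 bp, so half is 75 bp, reached by the second contig.
print(n50_l50([50, 40, 30, 20, 10]))  # (40, 2)
```

A higher N50 together with a lower L50, as reported for ICIS 5 and ICIS 53 relative to ICIS 9, indicates a less fragmented draft assembly.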
Genome annotation was performed using the National Center for Biotechnology Information (NCBI) Prokaryotic Genome Annotation Pipeline (PGAP) (http://www.ncbi.nlm.nih.gov/genome/annotation_prok (accessed 19 December 2021)), and 2195 coding sequences, including 2062 proteins (CDSs), 47 pseudogenes, complete rRNAs 3, 1 (5S, 16S) and 53 tRNAs, were identified. The identified coding proteins were classified into 26 functional categories based on COG classification (Supplementary Table S2). Of the 2062 protein-coding genes in ICIS 5, 1908 were assigned to COGs, and 154 genes were not assigned. The percentage of proteins with unknown function, including "Function unknown (S)" and "Not assigned (-)", was 31.7%. Most genes belonged to the categories Amino acid transport and metabolism (7.27% of CDS), Inorganic ion transport and metabolism (6.98% of CDS), Translation, ribosomal structure and biogenesis (6.84% of CDS), Replication, recombination and repair (6.3% of CDS) and Transcription (4.8% of CDS).
The genome of strain ICIS 9 was slightly larger than that of ICIS 5. It was composed of 2,587,830 bp, with an N50 length of 45,496 bp, an L50 of 18 and a G + C content of 58.6%. The final assembled genome consisted of 181 contigs. Genome annotation identified 2392 coding sequences, including 2277 proteins, 53 pseudogenes, complete rRNAs 1, 1, 1 (5S, 16S, 23S) and 53 tRNAs (Table 1). The identified coding proteins were classified into 26 functional categories based on COG classification (Supplementary Table S2). Of the 2277 protein-coding genes in ICIS 9, 2044 were assigned to COGs, and 233 genes were not assigned. The percentage of proteins with unknown function, including "Function unknown (S)" and "Not assigned (-)", was 33.2%.
In contrast to ICIS 5, the most represented COG categories in ICIS 9 were Replication, recombination and repair (10.01% of CDS), Inorganic ion transport and metabolism (6.19% of CDS), Translation, ribosomal structure and biogenesis (6.19% of CDS), Amino acid transport and metabolism (6.05% of CDS) and Coenzyme transport and metabolism (4.61% of CDS).
The draft genome of strain ICIS 53 was composed of 2,460,257 bp, with an N50 length of 170,410 bp, an L50 of 4, and a G + C content of 59.0%. The final assembled genome consisted of 41 contigs. Genome annotation identified 2173 coding sequences, including 2076 proteins, 34 pseudogenes, 5, 1, and 1 complete rRNAs (5S, 16S, 23S) and 53 tRNAs (Table 1). The identified coding proteins were classified into 26 functional categories based on COG classification (Supplementary Table S2). Of the 2076 protein-coding genes in ICIS 53, 1884 were assigned to COGs, and 189 genes were not assigned. The percentage of proteins with unknown function, including "Function unknown (S)" and "Not assigned (-)", was 33.9%. The most represented COG categories were Amino acid transport and metabolism (7.32% of CDS), Inorganic ion transport and metabolism (6.65% of CDS), Translation, ribosomal structure and biogenesis (6.55% of CDS), Replication, recombination and repair (4.82% of CDS) and Energy production and conversion (4.72% of CDS).
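The COG assignment counts reported above can be tabulated and cross-checked with a few lines of Python. The sketch below uses only the counts stated in the text; the class and method names are illustrative. It computes the share of protein-coding genes assigned to COGs and reports any genes not accounted for by the "assigned" and "not assigned" counts. For ICIS 53 the two reported counts sum to 2073 rather than 2076; the sketch only flags this difference and does not explain it.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CogSummary:
    strain: str
    protein_coding: int   # protein-coding genes reported in the text
    assigned: int         # genes assigned to a COG
    not_assigned: int     # genes reported as not assigned

    def assigned_percent(self) -> float:
        return round(100 * self.assigned / self.protein_coding, 2)

    def unaccounted(self) -> int:
        # Genes covered by neither reported count.
        return self.protein_coding - (self.assigned + self.not_assigned)


REPORTED = [
    CogSummary("ICIS 5", 2062, 1908, 154),
    CogSummary("ICIS 9", 2277, 2044, 233),
    CogSummary("ICIS 53", 2076, 1884, 189),
]

for s in REPORTED:
    print(s.strain, s.assigned_percent(), s.unaccounted())
```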

Phylogenetic Analysis
As shown in the phylogenetic tree constructed based on the 16S rRNA gene sequences, strains ICIS 5, ICIS 9, and ICIS 53 formed a common clade with nine strains of C. amycolatum from the NCBI database. The C. amycolatum clade, with a sister branch represented by Corynebacterium xerosis ATCC 373T, was clearly separated from another clade containing all other species of Corynebacterium (Figure 1). The data obtained are in good agreement with those described previously [40,41]. The similarity scores between ICIS 5, ICIS 9 and ICIS 53 and other C. amycolatum strains exceeded 99% based on 16S rRNA gene phylogeny. The average nucleotide identity (ANI) between ICIS 5, ICIS 9 and ICIS 53 ranged from 96.79% to 97.87%, and the orthologous average nucleotide identity (OrthoANI) ranged from 96.89% to 97.93% (Table 2).
The vaginal ecosystem is an aggressive environment for most microorganisms. The acidic environment in the vagina creates a natural filter; as a result, most pathogens and opportunistic microbes die. In order to survive and successfully colonise this ecosystem, microorganisms of the genus Corynebacterium must have evolved mechanisms of adaptation. In the studied genomes, we identified a large number of genes encoding proteins involved in stress response. These stresses included pH, temperature, osmotic pressure, nitrosative and oxidative stress. The detailed analysis of genes coding for proteins involved in stress response in the genomes of ICIS 5, ICIS 9 and ICIS 53 is shown in Table 3.
ICIS 5, ICIS 9 and ICIS 53 contain 7 genes that encode F0F1-ATPase. Membrane-bound ATP synthases (F0F1-ATPases) of bacteria serve two important physiological functions. The enzyme catalyses the synthesis of ATP from ADP and inorganic phosphate utilizing the energy of an electrochemical ion gradient. On the other hand, under conditions of low driving force, ATP synthases function as ATPases, thereby generating a transmembrane ion gradient at the expense of ATP hydrolysis [43]. Such activity protects cells from damage induced by an acidic environment. The genomes also contain 3 genes encoding Na+/H+ antiporters, membrane proteins that play a major role in pH and Na+ homeostasis of cells [44]. The analysis of the genomes revealed the presence of genes that encode L-lactate dehydrogenase. This enzyme catalyses the conversion of lactate to pyruvate with the formation of NADH. As a result, the NAD/NADH balance is restored and ATP production subsequently increases. The concomitant surplus of ATP is used to drive the F0F1-ATPases, resulting in enhanced acid tolerance in bacteria [45]. Furthermore, the ICIS 5, ICIS 9 and ICIS 53 genomes encode glucose-6-phosphate isomerase, GTP pyrophosphokinase, pyruvate kinase and the ATP-dependent Clp protease ATP-binding subunit, which are proteins involved in the acid resistance of various bacteria [46,47]. We also identified a number of genes related to temperature stress. These included the heat shock protein cluster hrcA-grpE-dnaK-dnaJ and the chaperonin system GroEL-GroES, which are present in all kingdoms of life and rescue proteins from improper folding and aggregation under internal and external stress conditions, including high temperatures and pressures [48,49].
A large number of genes associated with oxidative stress were identified in the studied genomes. They may be crucial for the survival and adaptation of the bacteria in the vaginal niche. The genomes contain genes that encode catalase, thiol peroxidase and glutathione peroxidase, which are antioxidant protective enzymes capable of detoxifying reactive oxygen species [50][51][52]. The genomes also harbour genes encoding superoxide dismutase (SOD) and the complete thioredoxin system. It is known that the SOD and thioredoxin (Trx) systems are key antioxidant systems in cellular protection against oxidative stress conditions [53][54][55][56]. In addition, genes related to glutaredoxin (nrdH) and mycothiol (MSH) were identified. Similar to glutathione, mycothiol is one of the key metabolites protecting bacteria from oxidative stress and is also involved in the detoxification of xenobiotics [57,58]. Recently, the significance of MSH has been shown for the resistance of Corynebacterium glutamicum to antibiotics, alkylating agents, ethanol and heavy metals [59,60]. The genomes of strains ICIS 5, ICIS 9 and ICIS 53 contained genes encoding the four enzymatic steps of mycothiol biosynthesis: production of GlcNAc-Ins-P by D-inositol-3-phosphate glycosyltransferase (MshA), deacetylation by N-acetyl-1-D-myo-inositol-2-amino-2-deoxy-alpha-D-glucopyranoside deacetylase (MshB) to form GlcN-Ins, ligation to cysteine by cysteine-1-D-myo-inosityl-2-amino-2-deoxy-alpha-D-glucopyranoside ligase (MshC), and acetylation by mycothiol synthase (MshD) to give MSH.
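Checking whether a genome encodes every step of a pathway, as done above for mycothiol biosynthesis, amounts to comparing the annotated gene names with an ordered list of required enzymes. The Python sketch below shows one simple way to do this. The step descriptions paraphrase the text, while the function name and the example gene sets are hypothetical and are not the annotations of the studied strains.

```python
# Ordered enzymatic steps of mycothiol (MSH) biosynthesis as listed in the text.
MYCOTHIOL_STEPS = [
    ("MshA", "glycosyltransferase producing GlcNAc-Ins-P"),
    ("MshB", "deacetylase forming GlcN-Ins"),
    ("MshC", "ligase attaching cysteine"),
    ("MshD", "mycothiol synthase (final acetylation)"),
]


def missing_steps(annotated_genes, pathway=MYCOTHIOL_STEPS):
    """Return pathway genes, in pathway order, absent from the annotation."""
    present = {name.lower() for name in annotated_genes}
    return [gene for gene, _ in pathway if gene.lower() not in present]


# Hypothetical annotations used only to exercise the function.
complete_genome = {"mshA", "mshB", "mshC", "mshD", "katA", "sodA"}
partial_genome = {"mshA", "mshC", "katA"}

print(missing_steps(complete_genome))  # []
print(missing_steps(partial_genome))   # ['MshB', 'MshD']
```

Matching on gene names alone is a simplification; in practice, annotation pipelines may label the same enzyme by product name or locus tag, so a real check would need a mapping between these identifiers.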

Biologically Active Secondary Metabolite-Related Genes
In order to survive and successfully colonise the vaginal biotope, nonpathogenic corynebacteria must be able to produce secondary metabolites with antimicrobial activity, which determine their competitive advantage [61]. In the studied genomes, we found gene clusters potentially involved in the biosynthesis of secondary metabolites. Gene clusters were predicted for T3pks (type III polyketide synthases), Nrps (nonribosomal peptide synthetases), Nrps-like and terpene. Each of the three genomes contained one T3pks gene cluster, which was associated with the biosynthesis of polyketides. Polyketides are natural metabolites that comprise the basic chemical structure of various anticancer, antifungal and anticholesteremic agents, antibiotics, parasiticides and immunomodulators [62,63]. These T3pks gene clusters encoded the biosynthesis of merochlorin A-D-like compounds (Supplementary Figures S1-S3). Merochlorins A-D, cyclic meroterpenoid antibiotics, were first described in the marine bacterium Streptomyces sp. strain CNH-189 [64]. The genomes of strains ICIS 5 and ICIS 53 contained one Nrps gene cluster, which was associated with the biosynthesis of phthoxazolin-like compounds (Supplementary Figures S4 and S5). Phthoxazolin, an oxazole-containing polyketide, has a broad spectrum of anti-oomycete activity and herbicidal activity [65]. Additionally, each genome contained one terpene gene cluster and one Nrps-like gene cluster, but their products were not identified. Terpenes or isoprenoids are the largest and structurally most diverse class of secondary metabolites. Terpenes are involved in a wide range of vital biological functions, including electron transport, cellular respiration, photosynthesis, membrane biosynthesis, signalling and growth regulation [66].
In accordance with their structural diversity, the functions of terpenoids range from mediating symbiotic or antagonistic interactions between organisms to electron transfer, protein prenylation, or contribution to membrane fluidity [67]. In addition, an increasing number of terpenes have been utilised for pharmaceuticals [68][69][70]. We checked the "Terpenoid backbone biosynthesis (map00900)" pathway in the genomes of strains ICIS 5, ICIS 9 and ICIS 53 and identified six key enzymes distributed in the mevalonate (MVA) pathway. The obtained data confirm the previously described MVA pathway for terpenoid backbone biosynthesis in C. amycolatum [71]. The core enzymes involved in the MVA pathway are listed in Table 4. All enzymes are encoded by a single gene, except isoprenyl transferase (undecaprenyl diphosphate synthase), which is encoded by two gene copies. Isoprenyl transferase catalyses the condensation of isopentenyl diphosphate (IPP) with allylic pyrophosphates, generating different types of terpenoids [72]. Farnesyl-diphosphate farnesyltransferase (squalene synthase) is a precursor of steroids, cholesterol, sesquiterpenes, farnesylated proteins, heme and vitamin K12 [73].

Bacteriocin-Related Genes
Bacteriocins are antimicrobial peptides ribosomally produced in bacteria, either processed or not by additional post-translational modification (PTM) enzymes, and exported to the extracellular medium [74,75]. Of the three genomes analysed, only the genome of strain ICIS 9 had one area of interest (AOI) that included genes encoding a bacteriocin of the sactipeptide class (Figure 2 and Supplementary Table S3). Sactipeptides are a recently described class of ribosomally synthesized and post-translationally modified peptides (RiPPs). Sactipeptides are known as narrow-spectrum antibiotics capable of inhibiting Clostridia and some multidrug-resistant human bacterial pathogens [76]. These features allow sactipeptides to be considered promising scaffolds for the creation of new antibiotics [77]. The presence of an AOI encoding a sactipeptide in the genome of strain ICIS 9 suggests that this strain may produce the sactipeptide. However, this assumption should be verified through isolation and characterisation of the peptide.

Nutrient Synthesis (Vitamins and Essential Amino Acids)-Related Genes
Microorganisms that colonize various biotopes of the human body form multi-species communities and represent a kind of "organ", which, in turn, affects the functioning of all organs and systems that play an important role in maintaining the health of the host. Using intestinal microbiota as an example, commensal bacteria have been shown to be important sources of vitamins and amino acids. In addition to their nutritional and physiological properties, many of these vitamins are also involved in the development and functioning of host immune cells, as there is a direct link between biosynthetic intermediates derived from commensal bacteria and the immune cells that directly recognise them [80,81]. The biosynthesis of vitamins and essential amino acids by probiotic strains has recently become an important aspect in the development of probiotic products and pharmaceuticals [82,83]. The genomes of strains ICIS 5, ICIS 9 and ICIS 53 contain functionally active biosynthetic gene clusters that encode all the enzymes required for the synthesis of B vitamins such as B2 (riboflavin), B6 (pyridoxine), B7 (biotin), B9 (folate) and B12 (cobalamin) (Table 5), and essential amino acids such as histidine, arginine, methionine, threonine, lysine, leucine and tryptophan (Table 6).

Antibiotic Resistance- and Virulence-Related Genes
We searched the genomes of the vaginal C. amycolatum isolates for antibiotic resistance genes and found genes encoding resistance to antibiotics in only two strains, ICIS 5 and ICIS 9. The genome of strain ICIS 5 contained genes encoding resistance to chloramphenicol and aminoglycosides (Supplementary Table S4). The genome of strain ICIS 9 contained genes encoding resistance to macrolides, lincosamide, streptogramin, tetracycline, chloramphenicol and aminoglycosides (Supplementary Table S4). The results confirmed the antibiotic resistance profile of these strains, which was determined earlier with a disc susceptibility assay [84]. VFDB was used to predict virulence factors in ICIS 5, ICIS 9 and ICIS 53. VFDB predicted 27 virulence factors in ICIS 5 and ICIS 9 and 23 virulence factors in ICIS 53 (Supplementary Table S5). The virulence genes of ICIS 5, ICIS 9 and ICIS 53 can be classified into ten categories: adherence, iron uptake, regulation, amino acid and purine metabolism, antiphagocytosis, cell surface components, immune evasion, lipid and fatty acid metabolism, protease and secretion system. However, none of the identified genes were true virulence factors; they determine only the structural and functional characteristics of microorganisms of the genus Corynebacterium, as well as adaptation to this ecological niche [85][86][87][88][89]. The true virulence genes, such as toxin-related genes (diphtheria toxin and phospholipase D) and haemolysin-related genes characteristic of well-known pathogenic corynebacteria, were not identified.

Phage Defense Systems
CRISPR-Cas modules are adaptive immune systems that are present in most archaea and many bacteria and provide sequence-specific protection against foreign DNA or, in some cases, RNA [90]. Of the three analysed genomes, CRISPR-associated sequence (Cas) systems were identified exclusively in the genome of strain ICIS 5. The type I-E CRISPR system consists of 7 cas genes: cas1e (locus_tag: E7L51_RS06945), cas2e (E7L51_RS06950), cas3 (E7L51_RS06915), cse4 (E7L51_RS06930), cas5e (E7L51_RS06935), cas6e (E7L51_RS06940) and cas7e (E7L51_RS06930). The abortive infection (Abi) system is another mechanism of phage resistance that can target different phases of phage development [91]. Abortive infection family proteins were identified in the genomes of strains ICIS 5 (locus_tag: E7L51_RS07100), ICIS 9 (BXT90_RS06835) and ICIS 53 (BGC22_RS10885). The presence of such phage defence systems in the studied strains probably reflects the exposure of these strains to phages in the vagina [92].

Conclusions
We presented a comparative study of draft genomes for three C. amycolatum vaginal strains based on the annotation and analysis of genes associated with the physiological functions and adaptation of these microorganisms under the specific conditions of vaginal microbiocenosis. The presence of resistance genes against acid and oxidative stress, which are specific features of the vaginal biotope, has been established. The genes responsible for the synthesis of essential amino acids and vitamins have been identified, demonstrating the involvement of the corynebacteria in host metabolism. The presence of genes associated with H2O2 production and the absence of true virulence genes allow us to consider the studied strains as potential probiotics capable of preventing the growth of pathogens and supporting colonisation resistance in the vaginal biotope. Antibiotic resistance genes revealed in the strains can provide corynebacteria the ability to colonise the vaginal biotope under antibiotic treatment. In addition, the studied strains are biotechnologically promising based on the identified genes for the synthesis of secondary metabolites of polyketides (merochlorins A-D), terpenes and sactipeptides. The analysis of C. amycolatum genomes revealed common genes for all three strains, for example, genes encoding adaptation and survival in the vaginal environment, as well as unique genes determining strain specificity.

Antibiotic-resistant genes detected in C. amycolatum ICIS 5 and ICIS 9; Supplementary Table S5: Virulence-related genes detected in C. amycolatum ICIS 5, ICIS 9 and ICIS 53; Supplementary Figure S1: Proposed biosynthetic gene cluster of merochlorins A-D-like compound in C. amycolatum ICIS 5. The most similar gene cluster from Streptomyces sp. CNH189 is shown, with related genes drawn in the same colour to highlight inter-cluster rearrangements; Supplementary Figure S2

Funding: This research did not receive any specific grant from funding agencies in the public, commercial, or not-for-profit sectors.